Abstract: MDS (maximum distance separable) array codes
are widely used in storage systems due to their computationally 
efficient encoding and decoding procedures. An MDS code with 
r redundancy nodes can correct any r erasures by accessing 
(reading) all the remaining information in both the systematic 
nodes and the parity (redundancy) nodes. However, in practice, 
a single erasure is the most likely failure event; hence, a natural 
question is how much information do we need to access in order 
to rebuild a single storage node? We define the rebuilding ratio 
as the fraction of remaining information accessed during the 
rebuilding of a single erasure. In our previous work we showed 
that the optimal rebuilding ratio of 1/r is achievable (using 
our newly constructed array codes) for the rebuilding of any
systematic node; however, all the information needs to be accessed
for the rebuilding of the parity nodes. Namely, constructing array
codes with a rebuilding ratio of 1/r for every node, systematic or
parity, was left as an open problem.
In this paper, we solve this open problem and present array codes 
that achieve the lower bound of 1/r for rebuilding any single 
systematic or parity node. 

I. Introduction 

MDS (maximum distance separable) array codes are a 
family of erasure-correcting codes used extensively as the 
basis for RAID storage systems. An array code consists of 
a 2-D array where each column can be considered as a disk. 
We will use the term column, node, or disk interchangeably. A 
code with r parity (redundancy) nodes is MDS if and only if 
it can recover from any r erasures. EVENODD [2] and RDP
[5] are examples of MDS array codes with two redundancies.
In this paper, we only consider systematic codes, namely, the 
information is stored exclusively in the first k nodes, and the 
parities are stored exclusively in the last r nodes. 

In order to correct r erasures, it is obvious that one has to 
access (or read) the information in all the surviving nodes. 
However, in practice it is more likely to encounter a single 
erasure rather than r erasures. So a natural question is: How
much information do we need to access when rebuilding 
a single erasure? Do we have to access all the surviving 
information? We define the rebuilding ratio as the ratio of 
accessed information to the remaining information in case of 
a single erasure. For example, it is easy to check that for
the code in Figure 1, if any two columns are erased, we can
still recover all the information; namely, it is an MDS code.
However, if column $C_1$ is erased, it can be rebuilt by accessing
$a_{0,2}, a_{1,2}$ from column $C_2$, $r_0, r_1$ from column $C_3$, and $z_0, z_1$
from column $C_4$, as follows:

Figure 1. An MDS array code with two systematic and two parity nodes. All
the elements are in finite field $F_3$. The first parity column $C_3$ is the row sum
and the second parity column $C_4$ is generated by the zigzags. For example,
zigzag $z_0$ contains the elements $a_{i,j}$ that satisfy $f_j^1(i) = 0$.

Here all elements are in finite field $F_3$. Hence, by accessing
only half of the remaining information, the erased node can be
rebuilt. Details on this new code will be discussed in Section II.

A related problem called repair bandwidth was first proposed
in [6]. The paradigm there is that one can access the entire
information and perform computations within each node,
and the question is how much information is transmitted for
rebuilding. A lower bound on the repair bandwidth was given
in [6]. When a single erasure occurs and all the remaining
nodes are accessible, the lower bound for the bandwidth is
1/r. Recently, a number of codes were designed to achieve the
bandwidth lower bound. When the number of parity nodes
is larger than that of the systematic nodes, explicit code
constructions were given in [8]-[10]. For all cases, Q, [11]
achieved the lower bound asymptotically.

It is clear that a lower bound on the repair bandwidth is also 
a lower bound on the rebuilding ratio. In [12] we presented
an explicit construction of MDS array codes that achieve the
lower bound on the ratio for rebuilding any systematic node.
A similar code construction was given in ^. Also in [7] a
similar code with 2 parities was proposed; it has optimal
repair bandwidth for any single erasure.

The main contribution of this paper is an explicit construction
of MDS array codes with r parity nodes that achieves
the lower bound 1/r for rebuilding any systematic or parity
node. The rebuilding of a single erasure has an efficient
implementation, as computations within nodes are not required.
Moreover, our codes have simple encoding and decoding
procedures: when r = 2 and r = 3, the codes require
finite-field sizes of 3 and 4, respectively.

The rest of the paper is organized as follows. Section II
introduces the rebuilding ratio problem for MDS array codes
and reviews the code construction in [12]. Section III describes
the construction of our codes with optimal rebuilding ratio.
Finally, the paper is summarized in Section IV.

II. Rebuilding Ratio Problem 

In this section we formally define the rebuilding ratio
problem and review the code construction in [12]. We then
prove that the construction can be made an MDS code. In
fact, this will be the basis for proving that our newly proposed
construction, which is described in Section III, is also an MDS
code.

We first define the framework of a systematic MDS array
code. Let $A = (a_{i,j})$ be an information array of size $p \times q$. A
column is also called a node, and an entry is called an element.
Each of the $q$ columns is a systematic node in the code. We
add $r$ parity columns to this array on the right, such that from
any $q$ columns, we can recover the entire information. In [12]
it was shown that if each information element is protected by
exactly $r$ parity elements, then each parity node corresponds to
$q$ permutations acting on $[0, p-1]$. More specifically, suppose
the permutations are $f_1, f_2, \dots, f_q$. Then the $t$-th element in
this parity node is a linear combination of all elements $a_{i,j}$
such that $f_j(i) = t$. The set of information elements contained
in this linear combination is called a zigzag set. For the $t$-th
element in the $l$-th parity, $t \in [0, p-1]$, $l \in [0, r-1]$, denote
by $f_1^l, \dots, f_q^l$ the set of associated permutations, and $Z_t^l$ the
zigzag set.

The ordering of the elements in each node can be arbitrary;
hence, we can assume that the first parity node is always
a linear combination of each row (corresponding to identity
permutations). Figure 1 is an example of such codes. The first
parity $C_3$ corresponds to identity permutations. The second
parity $C_4$ corresponds to the permutations

$$f_1^1 = (2, 3, 0, 1), \qquad f_2^1 = (1, 0, 3, 2).$$



For a given MDS code with parameters $q, r$, we ask what
fraction of the remaining elements is accessed in order to rebuild
a single node (in the average case). Hence, the rebuilding ratio of a code is:

$$R = \frac{\sum_{i=1}^{q+r} (\#\text{ accessed elements to rebuild node } i)}{(q+r)\,(\#\text{ remaining elements})}.$$



When a systematic node is erased, we rebuild each unknown
element by one of the parity nodes. That is, we access one
parity element containing the unknown, and access all the
elements in the corresponding zigzag set except the unknown.
In order to lower the number of accesses, we would like to
find (i) good permutations such that the accessed zigzag sets
intersect as much as possible, and (ii) proper coefficients in the
linear combinations such that the code is MDS. For example,
in Figure 1, in order to rebuild column $C_1$, we access the
zigzag sets $A = \{Z_0^0, Z_1^0\}$, $B = \{Z_0^1, Z_1^1\}$, corresponding to
parities $\{r_0, r_1\}$, $\{z_0, z_1\}$. The surviving elements in $A$ and in
$B$ are identical, i.e., $\{a_{0,2}, a_{1,2}\}$; therefore, only 1/2 of the
elements are accessed. Besides, the coefficients $\{1, 2\}$ in the
parity linear combinations guarantee that any two nodes are
sufficient to recover all the information. Hence the code is
MDS.
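
The access pattern of this example can be checked mechanically. The following Python sketch is illustrative only: it hard-codes the two zigzag permutations stated for Figure 1, uses 0-based row indices, and the helper names (`zigzag_sets`, `accessed_for_column_1`) are hypothetical and not taken from the paper. It tracks which elements are touched, not the field arithmetic.

```python
# Permutations of the second parity (C4) in Figure 1, one per systematic column.
# F[j][i] = f_j(i): element a_{i,j} belongs to zigzag z_{f_j(i)}.
F = {1: (2, 3, 0, 1), 2: (1, 0, 3, 2)}
ROWS = 4


def zigzag_sets(perms, rows):
    """Return {t: set of (row, column)} for the zigzag parity."""
    sets = {t: set() for t in range(rows)}
    for j, f in perms.items():
        for i in range(rows):
            sets[f[i]].add((i, j))
    return sets


def accessed_for_column_1(perms, rows, accessed_rows):
    """Split the information elements appearing in r_t and z_t, t in
    accessed_rows, into unknowns (column 1) and surviving elements."""
    zz = zigzag_sets(perms, rows)
    row_sets = {t: {(t, j) for j in perms} for t in range(rows)}
    used = set()
    for t in accessed_rows:
        used |= row_sets[t] | zz[t]
    unknowns = {(i, j) for (i, j) in used if j == 1}
    surviving = {(i, j) for (i, j) in used if j != 1}
    return unknowns, surviving


unknowns, surviving = accessed_for_column_1(F, ROWS, accessed_rows=(0, 1))
# All four elements of column 1 appear in the accessed equations ...
assert unknowns == {(i, 1) for i in range(ROWS)}
# ... while only a_{0,2} and a_{1,2} are read from column 2.
assert surviving == {(0, 2), (1, 2)}

# Two elements from each of C2, C3, C4, out of four per column.
ratio = (len(surviving) + 2 + 2) / (3 * ROWS)
print(ratio)  # 0.5
```

The two parity sets share the same surviving elements, which is exactly why the count stays at half of the remaining information.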

Next we review the construction with optimal rebuilding for
systematic nodes that was presented in [12]. The idea in the
code construction was to form permutations based on $r$-ary
vectors.

Let $e_1, e_2, \dots, e_k$ be the standard vector basis of $Z_r^k$. We
will use $x$ to represent both an integer in $[0, r^k - 1]$ and its
$r$-ary expansion (the $r$-ary vector of length $k$). It will be clear
from the context which meaning is used. All the calculations
are done over $Z_r$.

Construction 1 Let the information array be of size $r^k \times k$.
Define permutation $f_i^l$ on $[0, r^k - 1]$ as $f_i^l(x) = x + l e_i$,
$i \in [1, k]$, $l \in [0, r-1]$. For $t \in [0, r^k - 1]$, we define the
zigzag set $Z_t^l$ in parity node $l$ as the elements $a_{i,j}$ such that their
coordinates satisfy $f_j^l(i) = t$. Let $Y_j = \{x \in [0, r^k - 1] :
x \cdot e_j = 0\}$. Rebuild column $j$ by accessing rows $Y_j$ in all
remaining columns.

Theorem 1 Construction 1 has optimal ratio 1/r for rebuilding
any systematic node [12].

Figure 1 is an example of Construction 1. As mentioned
before, only 1/2 of the information is accessed in order to
rebuild $C_1$. The accessed elements are in rows $Y_1 = \{x \in
[0, 3] : x \cdot e_1 = 0\} = \{0, 1\}$.
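
The counting behind Theorem 1 can be reproduced for small parameters. The sketch below is a hedged illustration, not the authors' implementation: rows are represented directly as $r$-ary tuples, the function names are hypothetical, and only the access pattern is examined (which elements are read), not the coefficients of the linear combinations.

```python
from fractions import Fraction
from itertools import product


def vectors(r, k):
    """All r-ary vectors of length k, i.e. the row indices of the array."""
    return list(product(range(r), repeat=k))


def f(x, l, i, r):
    """f_i^l(x) = x + l*e_i over Z_r (i is 1-based, as in the text)."""
    y = list(x)
    y[i - 1] = (y[i - 1] + l) % r
    return tuple(y)


def systematic_rebuild_ratio(r, k, j):
    """Fraction of surviving elements read when column j is rebuilt from
    the parity elements indexed by Y_j = {x : x . e_j = 0}."""
    rows = vectors(r, k)
    Y = {x for x in rows if x[j - 1] == 0}
    recovered = set()   # rows of column j covered by an accessed equation
    read = set()        # (row, column) pairs read from surviving systematic nodes
    for l in range(r):              # parity node l
        for t in Y:                 # accessed parity element, zigzag set Z_t^l
            for col in range(1, k + 1):
                # the unique row x of column `col` with f_col^l(x) = t
                x = f(t, -l, col, r)
                if col == j:
                    recovered.add(x)
                else:
                    read.add((x, col))
    assert recovered == set(rows)          # every erased element is covered
    assert {x for x, _ in read} <= Y       # only rows Y_j are touched
    total_read = len(read) + r * len(Y)    # systematic plus parity elements
    total_left = (k - 1 + r) * len(rows)
    return Fraction(total_read, total_left)


for r, k in [(2, 2), (2, 3), (3, 2), (3, 3)]:
    for j in range(1, k + 1):
        assert systematic_rebuild_ratio(r, k, j) == Fraction(1, r)
```

Each erased element appears in exactly one accessed equation, and every other element of that equation lies in a row of $Y_j$, so the unknowns can be solved one at a time once the coefficients are non-zero.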

Next, we show that by assigning the coefficients in the
parities properly, the code is MDS. Let $P_j = (p_{i,l})$ be the
permutation matrix corresponding to $f_j = f_j^1$, namely, $p_{i,l} = 1$
if $l + e_j = i$, and $p_{i,l} = 0$ otherwise. Assigning the coefficients
is the same as modifying $p_{i,l} = 1$ to other non-zero values.
When $r = 2, 3$, modify $p_{i,l} = 1$ to $p_{i,l} = c$ if $l \cdot \sum_{t=1}^{j} e_t = 0$,
where $c$ is a primitive element of $F_3$, $F_4$, respectively. The
above assignment will make the code MDS for $r = 2, 3$ [12].
For example, the coefficients in Figure 1 are assigned in this
way.
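
For $r = 2$ the effect of this coefficient rule can be verified by brute force. The sketch below is an assumption-laden illustration: it fixes $F_3$ with $c = 2$, takes the first parity as plain row sums, orders the unknowns column by column, and uses hypothetical helper names. It checks that the equations of any surviving nodes determine all information elements after two erasures.

```python
from itertools import combinations, product

P = 3   # field size: F_3, so arithmetic is plain mod-3
C = 2   # primitive element of F_3


def rank_mod_p(rows, p=P):
    """Rank of an integer matrix over the prime field F_p."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)          # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                factor = m[i][col]
                m[i] = [(a - factor * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def node_equations(k, use_coefficients=True):
    """Equations over F_3 contributed by each node for r = 2.
    Unknown a_{x,j} has index (j-1)*2^k + position of x."""
    vecs = list(product(range(2), repeat=k))
    pos = {x: n for n, x in enumerate(vecs)}
    n_rows, n_unknowns = len(vecs), k * len(vecs)

    def blank():
        return [[0] * n_unknowns for _ in range(n_rows)]

    nodes = []
    for j in range(1, k + 1):                      # systematic nodes
        eqs = blank()
        for x in vecs:
            eqs[pos[x]][(j - 1) * n_rows + pos[x]] = 1
        nodes.append(eqs)
    row_parity, zigzag_parity = blank(), blank()
    for j in range(1, k + 1):
        for x in vecs:                             # x plays the role of l
            row_parity[pos[x]][(j - 1) * n_rows + pos[x]] = 1
            y = list(x)
            y[j - 1] ^= 1                          # y = x + e_j over Z_2
            # coefficient c when l . (e_1 + ... + e_j) = 0, otherwise 1
            coeff = C if use_coefficients and sum(x[:j]) % 2 == 0 else 1
            zigzag_parity[pos[tuple(y)]][(j - 1) * n_rows + pos[x]] = coeff
    return nodes + [row_parity, zigzag_parity], n_unknowns


def is_mds(k, use_coefficients=True):
    """True if every choice of two erased nodes leaves a full-rank system."""
    nodes, n_unknowns = node_equations(k, use_coefficients)
    for erased in combinations(range(len(nodes)), 2):
        kept = [row for n, eqs in enumerate(nodes) if n not in erased for row in eqs]
        if rank_mod_p(kept) < n_unknowns:
            return False
    return True


print(is_mds(2))                          # True
print(is_mds(2, use_coefficients=False))  # False
```

With all coefficients left at 1, erasing both systematic nodes of the $k = 2$ code gives a singular system, which shows why the modification is needed.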

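When $r \ge 4$, modify all $p_{i,l} = 1$ to $p_{i,l} = \lambda_j$, for some $\lambda_j$
in a finite field $F$. Let the generator matrix of the code be

$$G' = \begin{pmatrix} I & & \\ & \ddots & \\ & & I \\ P_1^0 & \cdots & P_k^0 \\ \vdots & & \vdots \\ P_1^{r-1} & \cdots & P_k^{r-1} \end{pmatrix}.$$

The following theorem shows that under this assignment the
code can be MDS.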

Theorem 2 (1) Construction 1 can be made an MDS code for
a large enough finite field.

(2) When $r = 2, 3$, a field of size 3 and 4, respectively, is sufficient to make the
code MDS.

Proof: Part (2) was given in [12]. We only prove part
(1). An MDS code means that it can recover any $r$ erasures.
Suppose $t$ systematic nodes and $r - t$ parity nodes are erased,
$1 \le t \le r$. Thus suppose we delete from $G'$ the systematic
rows $\{j_1, j_2, \dots, j_t\}$ and the remaining parity nodes are
$\{i_1, i_2, \dots, i_t\}$. Then the following $t \times t$ block matrix should
be invertible:

$$G = \begin{pmatrix} P_{j_1}^{i_1} & \cdots & P_{j_t}^{i_1} \\ \vdots & & \vdots \\ P_{j_1}^{i_t} & \cdots & P_{j_t}^{i_t} \end{pmatrix}. \qquad (1)$$

Its determinant $\det(G)$ is a polynomial with indeterminates
$\lambda_{j_1}, \dots, \lambda_{j_t}$. All terms have highest degree $r^k (i_1 + \cdots + i_t)$.
One term with highest degree is $\prod_{s=1}^{t} \lambda_{j_s}^{r^k i_s}$ with non-zero
coefficient 1 or $-1$. So $\det(G)$ is a non-zero polynomial.
Up to now we only showed one possible case of erasures.
For any $r$ erasures, we can find the corresponding non-zero
polynomial. The product of all these polynomials is again a
non-zero polynomial. Hence by [1], for a large enough field
there exist assignments of $\{\lambda_j\}$ such that the polynomial is
not 0. Then each $G$ is invertible, and the code is MDS. ■

III. Code Construction 

The code in [12] has optimal rebuilding for systematic
nodes. However, in order to rebuild a parity node, one has
to access all the information elements. In this section we
construct MDS array codes with optimal rebuilding ratio for
rebuilding both the systematic and the parity nodes. The code
has $k - 1$ systematic nodes and $r$ parity nodes, for any $k, r$.

Consider the permutation $f_j = f_j^1$ in Construction 1. It is
clear that $f_j$ is a permutation of order $r$, i.e., $f_j^r$ is the identity
permutation. For $i \in [0, r-1]$, define $X_i$ as the set of vectors
of weight $i$, namely, $X_i = \{v \in Z_r^k : v \cdot (1, \dots, 1) = i\}$. $X_0$
is a subgroup of $Z_r^k$ and $X_i = X_0 + i e_k$ is its coset, where
$e_k = (0, \dots, 0, 1)$. Assume the elements in $X_i$ are ordered,
$i \in [0, r-1]$, and the ordering is

$$X_0 = (v_1, \dots, v_{r^{k-1}}),$$
$$X_i = (v_1 + i e_k, \dots, v_{r^{k-1}} + i e_k).$$

Since the ordering of the elements in each column does not
matter, we can reorder them as $(X_0, X_1, \dots, X_{r-1})$, with each
$X_i$ ordered as above. One can check that $f_j(X_i) = X_{i+1}$,
where the subscript is added mod $r$. So the matrix $P_j$ can be
written as

$$P_j = \begin{pmatrix} & & & p_j \\ p_j & & & \\ & \ddots & & \\ & & p_j & \end{pmatrix}, \qquad (2)$$

where $p_j$ corresponds to the mapping of $f_j : X_i \to X_{i+1}$. In
particular, if $p_j$ is viewed as a permutation acting on $X_0$, then
for $x \in X_0$,

$$p_j(x) = x + e_j - e_k.$$

When $r = 2, 3$, modify the 1 entries of $p_j$ into $c$ if its
corresponding column $l$ satisfies $l \cdot \sum_{t=1}^{j} e_t = 0$. Here $c$ is
a primitive element in $F_3$, $F_4$. When $r \ge 4$, modify 1 entries
into $\lambda_j$.

In the following, we will use blocks the same as single
elements. When referring to row or column indices, we mean
block row or column indices. We refer to $p_j$ as a small block,
and the corresponding block row or column as a small block
row or column. And $P_j$ is called a big block with big block
row or column. Moreover, we assume the elements in each
column are in order $(X_0, \dots, X_{r-1})$.

Construction 2 Suppose the information array is of size $r^k \times (k-1)$.
For $j \in [1, k-1]$, define a big block matrix $A_j^0$,
where $\alpha \ne 0, 1$ is an element of the finite field and is multiplied
to the diagonal in rows $1, \dots, \lfloor r/2 \rfloor$. And define $A_j^i$ by cyclically
shifting the rows and columns of $A_j^0$ to the right and bottom by
$i$ positions, where $\beta = \alpha$ or 1. If $x - i < r/2$, or $x - i = r/2$ and $i < r/2$,
$\alpha$ is multiplied to the diagonal in row $x$. Construct the code as
follows. Let the first $k - 1$ nodes be systematic, and the last
$r$ nodes be parities. Parity $i$ is defined by $A_1^i, \dots, A_{k-1}^i$, which
form the parity rows of the generator matrix.
Sometimes we will omit the subscript $j$ when it is not
important, and the superscript is computed mod $r$.

Figure 2. Parity matrices $A^i$ for $r = 2$ (left) and $r = 3$ (right) parities.
When the first parity node is erased, the underlined elements are accessed
from systematic nodes. The remaining unknown elements are recovered by
the shaded elements from parity nodes.

Figure 3. An MDS array code with two systematic and two parity nodes by
Construction 2. The finite field used is $F_3$. The shaded elements are accessed
to rebuild the first parity node.
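
The coset structure that underlies the small blocks can be checked numerically. This sketch is illustrative and rests on stated assumptions: vectors are Python tuples over $Z_r$, $X_0$ is listed in lexicographic order (any fixed order would do), and the helper names are hypothetical. It confirms that $f_j$ maps $X_i$ onto $X_{i+1}$ and that the induced permutation is the same for every $i$.

```python
from itertools import product


def weight_classes(r, k):
    """X_i = {v in Z_r^k : v . (1,...,1) = i mod r}, each listed in the
    order v_1 + i*e_k, v_2 + i*e_k, ... induced by the ordering of X_0."""
    X0 = [v for v in product(range(r), repeat=k) if sum(v) % r == 0]
    return [[v[:-1] + ((v[-1] + i) % r,) for v in X0] for i in range(r)]


def f(v, j, r):
    """f_j(v) = v + e_j over Z_r (j is 1-based)."""
    return v[:j - 1] + ((v[j - 1] + 1) % r,) + v[j:]


def small_block(X, j, i, r):
    """Permutation induced by f_j : X_i -> X_{i+1}, as a list of positions."""
    target = {v: n for n, v in enumerate(X[(i + 1) % r])}
    return [target[f(v, j, r)] for v in X[i]]


r, k = 3, 3
X = weight_classes(r, k)
for j in range(1, k):
    blocks = [small_block(X, j, i, r) for i in range(r)]
    # The same small block p_j appears for every i.
    assert all(b == blocks[0] for b in blocks)
```

Because the block does not depend on $i$, a single permutation $p_j$ on $X_0$ describes the whole big block $P_j$.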

Example 3 For two and three parities, the matrices $A^i$ are
shown in Figure 2. When $r = 2$, as finite field $F_3$ is used, we
can take $\alpha = 2 \ne 1$. Coefficient $\alpha = 2$ is multiplied to only the
second diagonal in $A^0$. When $r = 3$, finite field $F_4$ is used and
we choose some $\alpha \ne 0, 1$. We multiply $\alpha$ to one diagonal block
in each $A^i$. An example of a code with 2 parities is shown in
Figure 3.

Next we show that the code in Construction 2 has optimal
ratio. We first observe that in $A^i$, the $x$-th row is

[row display not recoverable from the source]

where the values above are the column indices and omitted
blocks are all zero. Here $\beta = \alpha$ if $x - i < r/2$, or $x - i = r/2$
and $i < r/2$, and $\beta = 1$ otherwise. Therefore, suppose $i' - i < r/2$,
or $i' - i = r/2$ and $i < r/2$; then the $i'$-th row in $A^i$ and the $i$-th row
in $A^{i'}$ are the same except for the coefficients:

[display (3) not recoverable from the source: row $i'$ in $A^i$, which carries the coefficient $\alpha$ on one small block, and row $i$ in $A^{i'}$]   (3)

Theorem 4 The code has ratio 1/r for rebuilding any node. 

Proof: Systematic rebuilding: w.l.o.g. assume column
$e_1$ is erased. Access equations $Y = \{v \in Z_r^k : v \cdot e_1 = 0\}$
from each parity. We first show that all the unknowns
$(x_0, \dots, x_{r^k - 1})$ in column $e_1$ are solvable from these equations.
For all $l \in Y$, $x_l$ is contained in one of the accessed equations
because of the small row block $[\cdots I \cdots]$. Notice that $Y$ is a
subgroup of $Z_r^k$, and $Y - t e_k = Y$ for any $t \in [0, r-1]$. For
any $l \in Y$, suppose $l \in Y \cap X_{i'}$ for some $i'$, so $l + (i - i') e_k \in
Y \cap X_i$ for all $i \in [0, r-1]$. In (3) consider row $l$ in $A^i$ and
row $l + (i - i') e_k$ in $A^{i'}$, and write $t = i' - i \le \lfloor r/2 \rfloor$. Then
we have equations

[equations not recoverable from the source]

for some coefficients $a \ne 0, 1$ and $b, c \ne 0$. These equations
are obviously independent. Moreover, since $l + t(e_1 - e_k) \in
Y + t e_1$, we can solve unknowns indexed

[index set not recoverable from the source]

Hence all unknowns are solvable.

Next we show that the fraction of elements accessed in the
remaining columns is 1/r. For a parity node $A^i$, only rows $Y$
are accessed, which is a fraction of 1/r. The corresponding
columns in $A^i$ of these equations are accessed from the systematic
nodes. For a surviving systematic node $j \in [2, k-1]$
and parity $i$, by definition of $p^t$, rows $Y$ in $A^i$ are mapped
to columns $Y' = Y + t(e_k - e_j) + s e_k$ for some $s$. However,
$Y'$ is a coset of $Y$ and since $t(e_k - e_j) + s e_k \in Y$, we have
$Y' = Y$. Thus only elements with indices $Y$ are accessed from
each node.

Parity rebuilding: Since the parities are all symmetric,
w.l.o.g. suppose the first parity is erased. Access $X_0$ from each
node, which is the set of vectors of weight 0. We need to show
this is sufficient to recover

$$A = [A_1^0, \dots, A_{k-1}^0],$$

where $A_j^0$ is defined in Construction 2. Since $X_0$ is sent from
the systematic nodes, the 0-th column in each big block is
known, and we can remove them from the equations. By (3),
from parity $i'$ we can access row

[row display not recoverable from the source]

where the underlined elements are known from the systematic
nodes and can be treated as 0. Here $\beta'$ is 1 or $\alpha$. Multiplying
this row by $\beta$, we can rebuild the $i'$-th row of $A$:

[row display not recoverable from the source]

where $\beta \beta' = \alpha$ and $i' = 1, 2, \dots, r-1$. The 0-th row is rebuilt
from the systematic nodes directly. Thus the erased node is
rebuilt by accessing $X_0$, which is 1/r of the elements. ■

Example 5 Consider the code with two or three parities in
Figure 2. When the first parity node is erased, one can access
$X_0$ from the systematic nodes, and the underlined elements are
known. Then access the shaded elements from the surviving
parity nodes. It is easy to see that the first parity can be rebuilt
from the accessed elements.

For the specific example of Figure 3, when the first systematic
node is erased, one can access rows 0, 1, 2, 3 from all
surviving nodes. When the first parity node is erased, one can
access rows 0, 3, 5, 6 from all the remaining nodes (the shaded
elements). Then it is easy to check that in both cases it is
sufficient to rebuild the erased column.
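
The two row sets quoted for Figure 3 follow directly from the definitions of $Y$ and $X_0$. A minimal sketch, assuming $r = 2$, $k = 3$ and that a vector is read as an integer with its first coordinate most significant (the convention consistent with the permutations given for Figure 1); the helper `to_int` is hypothetical:

```python
from itertools import product

r, k = 2, 3   # Figure 3: k - 1 = 2 systematic nodes, r = 2 parities, r^k = 8 rows


def to_int(v, r):
    """Read an r-ary vector as an integer, first coordinate most significant."""
    n = 0
    for digit in v:
        n = n * r + digit
    return n


rows = list(product(range(r), repeat=k))
Y = sorted(to_int(v, r) for v in rows if v[0] == 0)          # v . e_1 = 0
X0 = sorted(to_int(v, r) for v in rows if sum(v) % r == 0)   # weight 0

print(Y)   # [0, 1, 2, 3]
print(X0)  # [0, 3, 5, 6]
```

Both sets contain $r^{k-1} = 4$ of the 8 rows, matching the ratio 1/2.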

Next we show the construction is indeed an MDS code.
We prove this by reducing this problem to the fact that
Construction 1 is MDS. First we make an observation on the
small blocks.

Lemma 6 Construction 1 is MDS iff any $t \times t$ sub block matrix
of

$$H' = \begin{pmatrix} p_1^0 & \cdots & p_k^0 \\ \vdots & & \vdots \\ p_1^{r-1} & \cdots & p_k^{r-1} \end{pmatrix}$$

is invertible, for all $t \in [1, r]$.

Proof: Consider the $t \times t$ sub block matrix of $H'$:

$$H = \begin{pmatrix} p_{j_1}^{i_1} & \cdots & p_{j_t}^{i_1} \\ \vdots & & \vdots \\ p_{j_1}^{i_t} & \cdots & p_{j_t}^{i_t} \end{pmatrix}.$$

We showed in Theorem 2 that Construction 1 is MDS iff any $G$
in (1) is invertible. W.l.o.g. suppose $\{i_1, \dots, i_t\} = \{0, \dots, t-1\}$,
$\{j_1, \dots, j_t\} = \{1, \dots, t\}$. By (2), $G$ can be rewritten as

[display of $G$ in terms of small blocks not recoverable from the source]

where each big block is composed of $r \times r$ small blocks. We
can see that the shaded small blocks are the only non-zero
blocks in their corresponding rows and columns, and they form
the sub-matrix $H$. Therefore $G$ being invertible is equivalent
to $H$ and the remaining sub-matrix both being invertible.
Moreover, the remaining sub-matrix has a similar form as $G$
and we can again find $t$ rows and $t$ columns corresponding to
$H$. Continuing this way, we get

$$\det(G) \ne 0 \iff (\det(H))^r \ne 0 \iff \det(H) \ne 0.$$

The same conclusion holds for any sub matrix of $H'$. This
completes the proof. ■

The method of taking out sub block matrices to compute the
determinant as above is also used in the proof of the following
theorem, which shows that Construction 2 is indeed an MDS
code.

Theorem 7 If the coefficients in the linear combinations of
the parities are chosen such that Construction 1 is MDS, then
Construction 2 is also MDS.

Proof: Similar to Theorem 2, Construction 2 being MDS
means any of the following matrices is invertible:

$$A = \begin{pmatrix} A_{j_1}^{i_1} & \cdots & A_{j_t}^{i_1} \\ \vdots & & \vdots \\ A_{j_1}^{i_t} & \cdots & A_{j_t}^{i_t} \end{pmatrix},$$

where $t \in [1, r]$, $I = \{i_1, \dots, i_t\} \subseteq [0, r-1]$, $\{j_1, \dots, j_t\} \subseteq
[1, k-1]$. Let the complement of $I$ be $\bar{I} = [0, r-1] \setminus I$. In
each big block consider the small block column $x \in \bar{I}$. Only
small block rows $x$ in each big block are non-zero. Thus we
can take out this $t \times t$ sub block matrix:

[display not recoverable from the source: a $t \times t$ matrix of small blocks with coefficients $\beta_i$]

where $\{\beta_i\}$ are 1 or $\alpha$. But by Lemma 6 the above matrix
is invertible. So we only need to look at the remaining sub
matrix. Again, we can take out another small block column and
row from $\bar{I}$ from each big block, and it is invertible by Lemma
6. Continuing this process, we are left with only columns and
rows of $I$ in each big block. For all $i, i' \in I$ with $1 \le i' - i < r/2$,
or $i' - i = r/2$ and $i < r/2$, consider row $i'$ in $A^i$ and row $i$ in $A^{i'}$.
They are shown in (3). One can do row operations and keep
the invertibility of the matrix, and get

[display not recoverable from the source]

Proceeding this way for all $i, i' \in I$, we are left with a block diagonal
matrix in each big block and the matrix left is of size $t^2 \times t^2$.
Taking out the $i_1$-th column and row in each big block, we
have the following $t \times t$ sub matrix:

[display not recoverable from the source: a $t \times t$ matrix of small blocks with exponents such as $i_2 - i_1, \dots, i_t - i_1$]

which is invertible by Lemma 6. Similarly, we can take out the
$i_2$-th column and row, and so on, and each sub matrix is again
invertible. Thus, any matrix $A$ is invertible and Construction
2 is MDS. ■

For example, one can easily check that the code in Figure
3 is able to recover the information from any two nodes.
Therefore it is an MDS code.

IV. Summary 

In this paper, we presented constructions of MDS array
codes that achieve the optimal rebuilding ratio 1/r, where
r is the number of redundancy nodes. The new codes are
constructed based on our previous construction in [12] and
improve the efficiency of the rebuilding access.

Now we mention a couple of open problems. For example,
if there are $k - 1$ systematic nodes and $r$ parity nodes, then
our code has $r^k$ rows. Namely, the code length is limited. Are
there codes that are longer given the number of rows? For
example, when $r = 2$, we know an optimal rebuilding ratio
construction with $r^k$ rows and $k$ systematic nodes:

[display not recoverable from the source]

Here $A_j^0, A_j^1$ are the matrices that generate the parities, and
we can take all $j \in [1, k]$. On the other hand, given $r^k$ rows,
it can be proven that any systematic and linear code with
optimal ratio has no more than $k + 1$ systematic nodes. Thus
the proposed code length can be improved by at most 2 nodes.
Finally, using the code in [12] one is able to rebuild any
$e$, $1 \le e \le r$, systematic erasures with an access ratio of $e/r$.
However, it is an open problem to construct a code that can
rebuild any $e$ erasures with optimal access.